Rethink Training of BERT Rerankers in Multi-stage Retrieval Pipeline
نویسندگان
چکیده
Pre-trained deep language models (LM) have advanced the state-of-the-art of text retrieval. Rerankers fine-tuned from LM estimates candidate relevance based on rich contextualized matching signals. Meanwhile, LMs can also be leveraged to improve search index, building retrievers with better recall. One would expect a straightforward combination both in pipeline additive performance gain. In this paper, we discover otherwise and that popular reranker cannot fully exploit improved retrieval result. We, therefore, propose Localized Contrastive Estimation (LCE) for training rerankers demonstrate it significantly improves two-stage (Our codes are open sourced at https://github.com/luyug/Reranker .).
منابع مشابه
Training Better CNNs Requires to Rethink ReLU
With the rapid development of Deep Convolutional Neural Networks (DCNNs), numerous works focus on designing better network architectures (i.e., AlexNet, VGG, Inception, ResNet and DenseNet etc.). Nevertheless, all these networks have the same characteristic: each convolutional layer is followed by an activation layer, a Rectified Linear Unit (ReLU) layer is the most used among them. In this wor...
متن کاملDynamic Trade-Off Prediction in Multi-Stage Retrieval Systems
Modern multi-stage retrieval systems are comprised of a candidate generation stage followed by one or more reranking stages. In such an architecture, the quality of the final ranked list may not be sensitive to the quality of initial candidate pool, especially in terms of early precision. This provides several opportunities to increase retrieval efficiency without significantly sacrificing effe...
متن کاملFuzzy retrieval of encrypted data by multi-purpose data-structures
The growing amount of information that has arisen from emerging technologies has caused organizations to face challenges in maintaining and managing their information. Expanding hardware, human resources, outsourcing data management, and maintenance an external organization in the form of cloud storage services, are two common approaches to overcome these challenges; The first approach costs of...
متن کاملMulti-Stage DNN Training for Automatic Recognition of Dysarthric Speech
Incorporating automatic speech recognition (ASR) in individualized speech training applications is becoming more viable thanks to the improved generalization capabilities of neural network-based acoustic models. The main problem in developing applications for dysarthric speech is the relative in-domain data scarcity. Collecting representative amounts of dysarthric speech data is difficult due t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Lecture Notes in Computer Science
سال: 2021
ISSN: ['1611-3349', '0302-9743']
DOI: https://doi.org/10.1007/978-3-030-72240-1_26